Gim Hee Lee

HandMCM: Multi-modal Point Cloud-based Correspondence State Space Model for 3D Hand Pose Estimation

Feb 02, 2026

Segment Any Events with Language

Jan 30, 2026

D3D-VLP: Dynamic 3D Vision-Language-Planning Model for Embodied Grounding and Navigation

Dec 14, 2025

4D3R: Motion-Aware Neural Reconstruction and Rendering of Dynamic Scenes from Monocular Videos

Nov 07, 2025

CLAIR: CLIP-Aided Weakly Supervised Zero-Shot Cross-Domain Image Retrieval

Aug 17, 2025

Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning

Jul 03, 2025

X-Scene: Large-Scale Driving Scene Generation with High Fidelity and Flexible Controllability

Jun 16, 2025

Event-Driven Dynamic Scene Depth Completion

May 19, 2025

LLaVA-4D: Embedding SpatioTemporal Prompt into LMMs for 4D Scene Understanding

May 18, 2025

Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation

May 16, 2025